An Improved Hash-based Join Algorithm in the Presence of Double Skew on a Hypercube Computer

Authors

  • Shao Dong Chen
  • Hong Shen
  • Rodney Topor
Abstract

This paper presents an improved parallel hash-based join algorithm on a hypercube computer in the presence of double skew. We describe a load balancing technique that evenly distributes both join relations across all processors in order to deal with double skew effectively. Moreover, we propose a permutation join method which reduces the main memory requirement for the local join operation in the previous method presented in [8] and speeds up the local join operation in each processor by using a two-index join method. Our algorithm partitions the relations into a number of buckets, evenly distributes each bucket across all processors by using the load balancing technique, and then joins each pair of matching buckets simultaneously by using the permutation join method. The performance analysis shows that our algorithm is more efficient than the previous algorithm [8] in the presence of double skew.
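The three stages named in the abstract (partition into buckets, spread each bucket evenly, join matching buckets) can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the function names, the round-robin dealing used for load balancing, and the ring rotation used in place of the paper's permutation join are all assumptions made for illustration, and the hypercube communication and the two-index local join are not modelled.

```python
from collections import defaultdict


def partition(relation, num_buckets):
    """Hash-partition (key, value) tuples into buckets by join key."""
    buckets = [[] for _ in range(num_buckets)]
    for tup in relation:
        buckets[hash(tup[0]) % num_buckets].append(tup)
    return buckets


def spread(bucket, num_procs):
    """Deal one bucket round-robin so fragment sizes differ by at most one.

    Assumed stand-in for the paper's load balancing technique.
    """
    return [bucket[p::num_procs] for p in range(num_procs)]


def build_index(fragment):
    """Index a fragment by join key for the local join."""
    index = defaultdict(list)
    for tup in fragment:
        index[tup[0]].append(tup)
    return index


def rotating_join(r_frags, s_frags):
    """Join two fragmented buckets by shifting S fragments around a ring.

    Assumed scheme: at step k, simulated processor p holds its own R
    fragment and the S fragment of processor (p + k) mod P, so every
    R fragment meets every S fragment exactly once.
    """
    num_procs = len(r_frags)
    s_indexes = [build_index(frag) for frag in s_frags]
    out = []
    for step in range(num_procs):
        for p in range(num_procs):
            index = s_indexes[(p + step) % num_procs]
            for r in r_frags[p]:
                for s in index.get(r[0], []):
                    out.append((r[0], r[1], s[1]))
    return out


def skew_tolerant_join(R, S, num_buckets=4, num_procs=4):
    """Partition both relations, spread each bucket, join matching buckets."""
    result = []
    for r_bucket, s_bucket in zip(partition(R, num_buckets),
                                  partition(S, num_buckets)):
        result.extend(rotating_join(spread(r_bucket, num_procs),
                                    spread(s_bucket, num_procs)))
    return result
```

Because each bucket is dealt out tuple by tuple rather than assigned whole to one processor, a join key that is heavily repeated in both relations (double skew) still leaves every simulated processor with roughly the same number of tuples from each side.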

Similar resources

Improving Near-Real-Time Data Warehouse Updates

A near-real-time data warehouse gives end users the essential information they need to make appropriate decisions. The fresher the data, the better the resulting decisions. To obtain fresh, up-to-date data, changes on the source side must be added to the data warehouse with little delay. For this reason, they should be transformed into the data wareh...

Skew-insensitive parallel algorithms for relational join

Join is the most important and expensive operation in relational databases. The parallel join operation is very sensitive to the presence of data skew. In this paper, we present two new parallel join algorithms for coarse-grained machines, which work optimally in the presence of an arbitrary amount of data skew. The first algorithm is sort-based and the second is hash-based. Both of these algorith...

Hypercube Bivariate-Based Key Management for Wireless Sensor Networks

Wireless sensor networks are composed of very small devices, called sensor nodes, for numerous applications in the environment. In adversarial environments, security becomes a crucial issue in wireless sensor networks (WSNs). There are various security services in WSNs such as key management, authentication, and pairwise key establishment. Due to some limitations on sensor nodes, the previous k...

An Improved Hash Function Based on the Tillich-Zémor Hash Function

Using the idea behind the Tillich-Zémor hash function, we propose a new hash function. Our hash function is parallelizable, and its collision resistance is implied by a hardness assumption on a mathematical problem. It is also secure against the known attacks. It is the most secure variant of the Tillich-Zémor hash function to date.

A Cuckoo Filter Modification Inspired by Bloom Filter

Probabilistic data structures are popular in membership queries, network applications, and so on. Bloom Filter and Cuckoo Filter are two popular space-efficient models used in the set membership checking part of many important protocols. They are compact representations of data that use hash functions to randomize a set of items. Being able to store more elements while keeping a reaso...

Publication date: 1994